add benchmarks using pytest-benchmark and codspeed #3562
Conversation
@zarr-developers/steering-council I don't have permission to register this repo with codspeed. I submitted a request to register it, could someone approve it?

done
    
does anyone have opinions about benchmarks? feel free to suggest something concrete. Otherwise, I think we should take this as-is and deal with later benchmarks (like partial shard read / writes) in a subsequent pr
    
        
          
.github/workflows/codspeed.yml (Outdated)
          
        
- uses: CodSpeedHQ/action@v4
  with:
    mode: instrumentation
    run: hatch run test.py3.11-1.26-minimal:run-benchmark
can we test the latest instead? seems more appropriate...
The latest version of python? What's the reasoning? I'd rather update this file when we drop a supported version vs when a new version of python comes out.
Because we'd want to catch a perf regression from upstream changes too? I'm suggesting the latest released versions of the libraries: py=3.13, np=2.2.
we don't have an upper bound on numpy versions, so I don't think this particular workflow will help us catch regressions from upstream changes -- we would need to update this workflow every time a new version of numpy is released. IMO that's something we should do in a separate benchmark workflow. This workflow here will run on every PR, and in that case the oldest version of numpy we support seems better.
we also don't have to use a pre-baked hatch environment here, we could define a dependency set specific to benchmarking. but my feeling is that benchmarking against older versions of stuff gives us a better measure of what users will actually experience.
          
indexing please. that'll exercise the codec pipeline too. a peakmem metric would be good to track also, if possible.
    
          
I don't think codspeed or pytest-benchmark do memory profiling. we would need https://pytest-memray.readthedocs.io/en/latest/ or something equivalent for that. and an indexing benchmark sounds like a great idea but I don't think I have the bandwidth for it in this pr right now
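For reference, a minimal sketch of what peak-memory tracking could look like with pytest-memray (activated via `pytest --memray`). The in-memory store, array shape, and the 100 MB budget below are illustrative assumptions, not something this PR adds:

```python
# Sketch only: assumes zarr-python 3's create_array / MemoryStore API and the
# pytest-memray plugin; the shapes and the memory budget are made up.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


# pytest-memray fails the test if peak allocations exceed the given budget
@pytest.mark.limit_memory("100 MB")
def test_write_peak_memory() -> None:
    data = np.ones((1024, 1024), dtype="uint8")
    z = zarr.create_array(MemoryStore(), shape=data.shape, chunks=(256, 256), dtype=data.dtype)
    z[:] = data
```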
    
I added a benchmark that clearly reveals the performance improvement of #3561
    
I added some slice-based benchmarks based on the examples from #3524, and I updated the contributing docs with a section about the benchmarks. assuming we can resolve the discussion about which python / numpy version to use in the CI job, I think this is ready
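As a rough illustration (not necessarily the exact benchmarks added here), a parametrized slice-read benchmark with the pytest-benchmark `benchmark` fixture could look like the following; the array shape, chunk grid, and selections are assumptions:

```python
# Sketch only: assumes zarr-python 3's create_array / MemoryStore API.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


@pytest.fixture
def array() -> zarr.Array:
    z = zarr.create_array(MemoryStore(), shape=(1024, 1024), chunks=(128, 128), dtype="uint8")
    z[:] = np.ones((1024, 1024), dtype="uint8")
    return z


# each selection exercises chunk indexing plus the codec pipeline on read
@pytest.mark.parametrize("selection", [np.s_[0, :], np.s_[:, 0], np.s_[100:200, 100:200]])
def test_read_selection(benchmark, array: zarr.Array, selection) -> None:
    benchmark(array.__getitem__, selection)
```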
    
new problem: the codspeed CI benchmarks are way too slow! the benchmark suite runs in 90s locally, and it's taking over 40m to run in CI. Help would be appreciated in speeding this up.
    
owing to the large number of syscalls in our benchmark code, codspeed recommended using the walltime instrument instead of their virtual CPU instrument. But to turn on the walltime benchmark, we would need to run our benchmarking code on codspeed's servers, which is a security risk. Given that codspeed is not turning out to be particularly simple, I am inclined to defer the codspeed CI stuff for later work. But if someone can help get the test runtime down, and / or we are OK running our benchmarks on codspeed's servers, then maybe we can get that sorted in this PR.
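One possible way to bring the CI runtime down, sketched under the assumption that the heaviest cases can simply be skipped there: register a `slow` marker and deselect it in the codspeed job (e.g. run pytest with `-m "not slow"`) while still running everything locally. The marker name and array sizes are hypothetical:

```python
# Sketch only: a hypothetical "slow" marker (it would need to be registered in
# the pytest config) so the heaviest benchmarks can be deselected in CI.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


@pytest.mark.slow
def test_write_large_array(benchmark) -> None:
    data = np.ones((4096, 4096), dtype="uint8")
    z = zarr.create_array(MemoryStore(), shape=data.shape, chunks=(512, 512), dtype=data.dtype)

    def write() -> None:
        z[:] = data

    benchmark(write)
```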
    
since #3554 was an unpopular direction I'm going instead with codspeed + pytest-benchmark. Opening as a draft because I haven't looked into how codspeed works at all, but I'd like people to weigh in on whether these initial benchmarks make sense. Naturally we can add more specific ones later, but I figured just some bulk array read / write workloads would be a good start.
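For concreteness, the kind of bulk read / write benchmark being described might look roughly like the sketch below; the array size, chunking, and in-memory store are illustrative assumptions rather than the code in this PR:

```python
# Sketch only: whole-array write and read timed via the pytest-benchmark
# `benchmark` fixture; assumes zarr-python 3's create_array / MemoryStore API.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


@pytest.fixture
def data() -> np.ndarray:
    return np.random.default_rng(0).integers(0, 255, size=(1024, 1024), dtype="uint8")


def test_write_full_array(benchmark, data: np.ndarray) -> None:
    z = zarr.create_array(MemoryStore(), shape=data.shape, chunks=(256, 256), dtype=data.dtype)

    def write() -> None:
        z[:] = data

    # the fixture runs the callable repeatedly and records timing statistics
    benchmark(write)


def test_read_full_array(benchmark, data: np.ndarray) -> None:
    z = zarr.create_array(MemoryStore(), shape=data.shape, chunks=(256, 256), dtype=data.dtype)
    z[:] = data
    benchmark(lambda: z[:])
```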